1.  INTRODUCTION


Two new estimation methods for parameters in log-linear models are introduced in this paper. One of them is particularly easy to compute, and both are shown to have appropriate asymptotic properties. The existence results yield conditions that are more easily verified than those available for the likelihood estimator. The new methods are designated the minimum d method and the approximate minimum d method. We proceed with the development of the needed notation and background.


Following Haberman (1974), we regard a contingency table as a k-vector $n$, $n' = (n_1, \ldots, n_k)$, where $n_i$ denotes the number of observations in cell $i$ from one of $L$ multinomial distributions. Specifically, the sets $\{n_i : i \in I_\ell\}$, $\ell = 1, \ldots, L$, define independent multinomial vectors,

$$\{n_i : i \in I_\ell\} \sim M\Bigl(N_\ell,\ \bigl\{\pi_i : i \in I_\ell,\ \pi_i > 0,\ \sum_{i \in I_\ell} \pi_i = 1\bigr\}\Bigr),$$

where $N_\ell = \sum_{i \in I_\ell} n_i$, and $I_1, \ldots, I_L$ are disjoint and exhaustive subsets of $\{1, \ldots, k\}$. Let $\pi' = (\pi_1, \ldots, \pi_k)$ and $p' = (p_1, \ldots, p_k)$, where $p_i = n_i / N_\ell$ for $i \in I_\ell$. Corresponding to $\pi$ and $p$, let $\gamma' = (\gamma_1, \ldots, \gamma_k)$, where $\gamma_i = \log \pi_i$, $i = 1, \ldots, k$, and $\log p' = (\log p_1, \ldots, \log p_k)$. Finally, for a k-vector $x$, let $\exp x' = (\exp x_1, \ldots, \exp x_k)$.
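The notation above can be made concrete with a small sketch. The function name `cell_proportions`, the flat-list storage of the table, and the example counts are our own assumptions for illustration; only the definitions of $N_\ell$, $p_i$ and $\log p_i$ come from the text.

```python
import math

def cell_proportions(counts, groups):
    """Return (N_l for each multinomial, p, log p) for a flat table.

    counts : cell counts n_1..n_k
    groups : index lists I_1..I_L (0-based, disjoint, exhaustive)
    """
    p = [0.0] * len(counts)
    totals = []
    for idx in groups:
        n_l = sum(counts[i] for i in idx)      # N_l = sum of n_i over I_l
        totals.append(n_l)
        for i in idx:
            p[i] = counts[i] / n_l             # p_i = n_i / N_l
    # log p_i is set to -inf for an empty cell
    log_p = [math.log(v) if v > 0 else float("-inf") for v in p]
    return totals, p, log_p

# Hypothetical table: k = 5 cells from L = 2 multinomials
counts = [10, 30, 20, 40, 40]
groups = [[0, 1], [2, 3, 4]]
totals, p, log_p = cell_proportions(counts, groups)
```

Within each index set the proportions sum to one, which is the property that the set $\Gamma_S$ later imposes on $\exp \gamma$.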


The development of log-linear models was heralded by Bartlett (1935) and by Roy and Kastenbaum (1956). They define interactions in terms of constraints on products of elements of $\pi$. Birch (1963) shows that these constraints coincide with constraints on linear combinations of elements of $\gamma$. Most generally, in a log-linear model, the parameters $\gamma$ are constrained by $B_1 \gamma = 0_m$, $0 \le m < k$; when $m = 0$, there are no linear constraints on $\gamma$. The matrix $B_1$ is assumed to be an $m \times k$ matrix with orthonormal rows and to satisfy

$$B_1 A = 0_{m \times L}, \tag{1.1}$$

where $A$ is the $k \times L$ matrix given by

$$a_{i\ell} = \begin{cases} 1, & i \in I_\ell, \\ 0, & \text{otherwise.} \end{cases} \tag{1.2}$$
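As an illustration of (1.1) and (1.2), the sketch below builds $A$ for a 2 x 2 table sampled from a single multinomial and checks the two requirements on $B_1$. The choice of table, the no-interaction contrast used for $B_1$, and the helper names are assumptions made for this example, not part of the original development.

```python
import math

def indicator_matrix(k, groups):
    """k x L matrix A with a[i][l] = 1 if cell i belongs to I_l, else 0 (eq. 1.2)."""
    A = [[0] * len(groups) for _ in range(k)]
    for l, idx in enumerate(groups):
        for i in idx:
            A[i][l] = 1
    return A

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

# 2 x 2 table, cells ordered (1,1), (1,2), (2,1), (2,2); one multinomial, so L = 1
groups = [[0, 1, 2, 3]]
A = indicator_matrix(4, groups)

# Assumed constraint: zero interaction, gamma_11 - gamma_12 - gamma_21 + gamma_22 = 0
B1 = [[0.5, -0.5, -0.5, 0.5]]

B1A = matmul(B1, A)                                   # should be the 1 x 1 zero matrix
row_norm = math.sqrt(sum(b * b for b in B1[0]))       # should be 1 (orthonormal row)
```

Here $m = 1$, $k = 4$ and $L = 1$, so the model leaves $k - m - L = 2$ free parameters, as expected for independence in a 2 x 2 table.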


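We denote the parameter space of $\gamma$ by $\Gamma(B_1)$. Let

$$\Gamma_S = \Bigl\{\gamma : \sum_{i \in I_\ell} \exp \gamma_i = 1,\ \ell = 1, \ldots, L\Bigr\}, \tag{1.3}$$

and

$$\Gamma_M(B_1) = \{\gamma : B_1 \gamma = 0_m,\ \gamma \in E^k\}. \tag{1.4}$$

Evidently

$$\Gamma(B_1) = \Gamma_M(B_1) \cap \Gamma_S. \tag{1.5}$$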


Since its rows are orthonormal, we may augment $B_1$ to a $k \times k$ orthonormal matrix through addition of additional rows $B_2$ and define

$$B' = [B_1', B_2']. \tag{1.6}$$

Reparameterization of $\gamma$ defines

$$\beta = B \gamma, \tag{1.7}$$

$$\gamma = B' \beta. \tag{1.8}$$

Since $\gamma \in \Gamma_M(B_1)$ implies $B_1 \gamma = 0_m$,

$$\beta' = (\beta_1', \beta_2') = (0_m', \beta_2'), \tag{1.9}$$

(1.8) reduces to

$$\gamma = B_2' \beta_2, \tag{1.10}$$

and (1.10) provides an alternative formulation of the general log-linear model.

There are many methods for estimation of the parameters of a log-linear model, the method of maximum likelihood being the best known. Estimation equations for the general log-linear model are easily derived by the methods of Birch (1963) or Lagrange (Apostol, 1957), and, for $L = 1$, are given by

$$B_1 \hat\gamma = 0_m, \qquad B_2 (p - \exp \hat\gamma) = 0_{k-m}. \tag{1.11}$$

Generally, equations (1.11) require iterative solution. Conditions under which solutions may be obtained explicitly are available (Andersen, 1974; Bishop, Fienberg, and Holland, 1975).
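One case in which (1.11) has an explicit solution is independence in a 2 x 2 table from a single multinomial, where the fitted probabilities are the products of the observed margins. The sketch below checks numerically that this closed form satisfies both parts of (1.11). The counts and the particular orthonormal matrices `B1` and `B2` are assumptions chosen for the example.

```python
import math

counts = [20, 30, 10, 40]              # hypothetical 2 x 2 table, cells (1,1),(1,2),(2,1),(2,2)
N = sum(counts)
p = [n / N for n in counts]

# Closed-form likelihood fit under independence: product of observed margins
row = [p[0] + p[1], p[2] + p[3]]
col = [p[0] + p[2], p[1] + p[3]]
pi_hat = [row[0] * col[0], row[0] * col[1], row[1] * col[0], row[1] * col[1]]
gamma_hat = [math.log(v) for v in pi_hat]

# Assumed orthonormal basis: B1 is the interaction contrast, B2 spans the model space
B1 = [0.5, -0.5, -0.5, 0.5]
B2 = [[0.5, 0.5, 0.5, 0.5],
      [0.5, 0.5, -0.5, -0.5],
      [0.5, -0.5, 0.5, -0.5]]

# First part of (1.11): B1 gamma_hat = 0
eq_constraint = sum(b * g for b, g in zip(B1, gamma_hat))

# Second part of (1.11): B2 (p - exp gamma_hat) = 0
eq_score = [sum(b * (pi - math.exp(g)) for b, pi, g in zip(b_row, p, gamma_hat))
            for b_row in B2]
```

Both residuals vanish up to rounding, so no iteration is needed for this model; for models without such a closed form, (1.11) would have to be solved numerically.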


Haberman (1973, 1974) and Andersen (1974) consider the existence and uniqueness of the likelihood estimator. Their results, in our notation, are summarized in the following theorem.

Theorem 1.1: If a maximum likelihood estimator $\hat\gamma$ of $\gamma$ exists, it is unique, and satisfies (1.11). Furthermore, any of the following conditions is necessary and sufficient for the existence of a likelihood estimator:

i) there exists a vector $\hat\gamma$ which satisfies (1.11),

ii) there exists a vector $v$ such that $B_2 v = 0_{k-m}$, and $p_i + v_i > 0$, $i = 1, \ldots, k$, and

iii) there exists no vector $v$ such that $B_1 v = 0_m$, $v_i \le 0$, $i = 1, \ldots, k$, $v \ne 0_k$, and $p'v = 0$.

Except in specific cases, the conditions of Theorem 1.1 are difficult to apply in practice.

The parameters of a log-linear model may be estimated also by the principle of weighted least squares (Grizzle, Starmer, and Koch, 1969; Koch et al., 1977).

Both the weighted least squares and the maximum likelihood approaches lead to estimators with good asymptotic properties. In Section 2, we introduce the two new estimation methods. Existence and uniqueness of the estimators are discussed in Section 3 and, in Section 4, it is shown that the new estimates also have good asymptotic properties.


2.  MINIMUM d ESTIMATION


2.1  The Minimum d Method

Minimum  d  estimation  is  not  really  new,  as  d  is 
included  in  the  class  of  functions  studied  by  Neyman  (1949), 
Taylor  (1953),  and  Ferguson  (1958),  the  minimization  of 
which  leads  to  B.A.N.  estimators.  It  does,  however,  seem  to 
have  special  efficacy  for  log-linear  models  and  leads  to  the 
approximate  minimum  d  estimation  method  discussed  in 
Section  2.2. 

Let

$$d(\gamma; n) = \sum_{i=1}^{k} n_i (\log p_i - \gamma_i)^2 = (\log p - \gamma)' N (\log p - \gamma), \tag{2.1}$$

where $N$ is the diagonal matrix with entries $n_1, \ldots, n_k$. If, for some $i$, $n_i$ is zero, the contribution to $d(\gamma; n)$ of the $i$th cell is taken to be zero. Further, we use

$$0 (\log 0)^q = 0, \tag{2.2}$$

since

$$\lim_{n_i \to 0} n_i (\log p_i)^q = \lim_{p_i \to 0} N_\ell \, [p_i (\log p_i)^q] = 0, \quad i \in I_\ell,\ \ell = 1, \ldots, L.$$

The minimum d estimator $\tilde\gamma$ of $\gamma$ is defined to be that point in $\Gamma(B_1)$ which minimizes $d(\gamma; n)$. Equivalently, $\tilde\mu$ and $\tilde\pi$ denote the corresponding estimators of $\mu$ and $\pi$ respectively.
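The criterion (2.1), together with the empty-cell convention (2.2), can be sketched as follows. The function name `min_d_criterion` and the example table are hypothetical; the arithmetic is only what (2.1) and (2.2) state.

```python
import math

def min_d_criterion(gamma, counts, groups):
    """d(gamma; n) = sum_i n_i (log p_i - gamma_i)^2, with empty cells contributing 0."""
    total = 0.0
    for idx in groups:
        n_l = sum(counts[i] for i in idx)
        for i in idx:
            if counts[i] == 0:
                continue                      # convention (2.2): 0 * (log 0)^q = 0
            log_p = math.log(counts[i] / n_l)
            total += counts[i] * (log_p - gamma[i]) ** 2
    return total

counts = [5, 0, 15]                           # hypothetical table with one empty cell
groups = [[0, 1, 2]]

# A candidate with equal cell probabilities
gamma_uniform = [math.log(1 / 3)] * 3
d_uniform = min_d_criterion(gamma_uniform, counts, groups)

# A candidate matching log p on the non-empty cells; the empty cell is ignored
gamma_match = [math.log(0.25), -5.0, math.log(0.75)]
d_match = min_d_criterion(gamma_match, counts, groups)
```

The second candidate gives $d = 0$ whatever value is placed in the empty cell, which hints at why existence of a minimizer over $\Gamma(B_1)$ becomes delicate when some cells are empty.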
In analogy to the likelihood estimator, the minimum d estimator satisfies a system of equations whenever it exists. This system is given by the following theorem.

Theorem 2.1: If a minimum d estimator $\tilde\gamma$ of $\gamma$ exists, then it satisfies the equations,

$$B_2 [N (\log p - \tilde\gamma) - \psi(\tilde\gamma)] = 0_{k-m}, \tag{2.3}$$

$$B_1 \tilde\gamma = 0_m, \tag{2.4}$$

and

$$\sum_{i \in I_\ell} \exp \tilde\gamma_i = 1, \quad \ell = 1, \ldots, L, \tag{2.5}$$

where $\psi(\gamma)' = [\psi_1(\gamma), \ldots, \psi_k(\gamma)]$,

$$\psi_j(\gamma) = (\exp \gamma_j) \sum_{i \in I_\ell} n_i (\log p_i - \gamma_i) \tag{2.6}$$

for $j \in I_\ell$, $\ell = 1, \ldots, L$.

Proof: The theorem will follow from Lagrange's Theorem (Apostol, 1957, Th. 7-10; Williamson et al., 1972, p. 595) upon verification of the conditions of that theorem. The set $\{\gamma : \gamma_i < 0,\ i = 1, \ldots, k\}$ is associated with the open set of the theorem. The constraint functions are

$$B_1 \gamma$$

and

$$\sum_{i \in I_\ell} \exp \gamma_i - 1, \quad \ell = 1, \ldots, L,$$

and they vanish on $\Gamma(B_1)$.

The matrix

$$(B_1', \varepsilon_1, \ldots, \varepsilon_L)', \tag{2.7}$$

where $\varepsilon_\ell' = (\varepsilon_{\ell 1}, \ldots, \varepsilon_{\ell k})$,

$$\varepsilon_{\ell i} = \begin{cases} \exp \gamma_i, & i \in I_\ell, \\ 0, & i \notin I_\ell, \end{cases} \tag{2.8}$$

is of full rank, since $B_1$ is of full rank and it may be proven that no row $\varepsilon_\ell'$ is a linear combination of rows of $B_1$. The proof follows by contradiction. Suppose that

$$\varepsilon_\ell' = \sum_{j=1}^{m} a_j B_{1j},$$

where $B_{1j}$ denotes the $j$th row of $B_1$. Then we know that

$$\varepsilon_\ell' 1_k = \Bigl(\sum_{j=1}^{m} a_j B_{1j}\Bigr) 1_k = \sum_{j=1}^{m} a_j B_{1j} 1_k = 0$$

by (1.1), whereas $\varepsilon_\ell' 1_k = \sum_{i \in I_\ell} \exp \gamma_i = 1$ by (1.3), and the contradiction is established.

Finally, since it is apparent that $d$ and the constraint functions have continuous partial derivatives with respect to $\gamma_j$, $j = 1, \ldots, k$, on $\{\gamma : \gamma_i < 0,\ i = 1, \ldots, k\}$, the conditions of Lagrange's Theorem are met.


Let

$$\Lambda(\gamma; n) = d(\gamma; n) + \phi' B_1 \gamma + \sum_{\ell=1}^{L} \lambda_\ell \Bigl(\sum_{i \in I_\ell} \exp \gamma_i - 1\Bigr)$$

be the Lagrangian function for the minimization of $d$ subject to $\gamma \in \Gamma(B_1)$. Then Lagrange's Theorem implies that, given $\tilde\gamma \in \Gamma(B_1)$, values of the $m$-vector $\phi$, and $\lambda_1, \ldots, \lambda_L$ exist satisfying the equations,

$$\frac{\partial \Lambda(\gamma; n)}{\partial \gamma_j}\Big|_{\gamma = \tilde\gamma} = -2 n_j (\log p_j - \tilde\gamma_j) + \sum_{i=1}^{m} \phi_i b_{ij} + \lambda_\ell \exp \tilde\gamma_j = 0, \quad j \in I_\ell,\ \ell = 1, \ldots, L, \tag{2.9}$$

where $b_{ij}$ is the $(i, j)$ element of $B_1$. Summation of the subset of equations (2.9) over $j \in I_\ell$ yields

$$-2 \sum_{j' \in I_\ell} n_{j'} (\log p_{j'} - \tilde\gamma_{j'}) + \lambda_\ell = 0, \tag{2.10}$$

the term involving the $b_{ij}$'s vanishing due to (1.1). With the use of (2.10) and vector notation, (2.9) becomes

$$-2 N (\log p - \tilde\gamma) + B_1' \phi + 2 \psi(\tilde\gamma) = 0_k, \tag{2.11}$$

where $\psi(\gamma)$ is given by (2.6). Premultiplication of (2.11) by $B_1$ yields

$$-2 B_1 N (\log p - \tilde\gamma) + \phi + 2 B_1 \psi(\tilde\gamma) = 0_m, \tag{2.12}$$

which implies

$$\phi = 2 B_1 N (\log p - \tilde\gamma) - 2 B_1 \psi(\tilde\gamma). \tag{2.13}$$

Substitution of (2.13) into (2.11) yields

$$(I_k - B_1' B_1) [N (\log p - \tilde\gamma) - \psi(\tilde\gamma)] = 0_k, \tag{2.14}$$

and, with premultiplication by $B_2$, (2.14) reduces to

$$B_2 [N (\log p - \tilde\gamma) - \psi(\tilde\gamma)] = 0_{k-m}. \tag{2.15}$$

Thus, $\tilde\gamma$ must satisfy (2.3), and the constraint equations (2.4) and (2.5). This completes the proof.


The system (2.3)-(2.5) contains $k + L$ equations. We can show that $L$ equations of (2.3) are redundant. Without loss of generality, we take the first $L$ rows of $B_2$ as proportional to the rows of $A'$, where $A$ is given by (1.2). Then,

$$A' [N (\log p - \tilde\gamma) - \psi(\tilde\gamma)] = \begin{bmatrix} \bigl[\sum_{i \in I_1} n_i (\log p_i - \tilde\gamma_i)\bigr] \bigl(1 - \sum_{i \in I_1} \exp \tilde\gamma_i\bigr) \\ \vdots \\ \bigl[\sum_{i \in I_L} n_i (\log p_i - \tilde\gamma_i)\bigr] \bigl(1 - \sum_{i \in I_L} \exp \tilde\gamma_i\bigr) \end{bmatrix} = 0_L$$

by (2.5). Computation of $\tilde\gamma$ may be effected through solution of (2.4), (2.5), and the last $(k - m - L)$ equations of (2.3).

2.2  The Approximate Minimum d Method

Calculation of the minimum d estimator necessarily involves solution of equations (2.4), (2.5), and the last $(k - m - L)$ equations of (2.3). As this system is nonlinear, its solution may prove to be a formidable task indeed. In this section, we develop an approximation to the minimum d estimator, one which is examined as a new estimator. The method is of interest because the approximation, unlike the minimum d estimator, is relatively easy to compute, and, like the minimum d estimator, has good asymptotic properties. Furthermore, an approximate minimum d estimator always exists.

As noted, minimization of $d$ over $\Gamma(B_1)$ may prove to be a difficult problem. A somewhat simpler problem is the minimization of $d$ over $\Gamma_M(B_1)$, and this is the idea behind the approximate method.

Let $\gamma_a$ denote the approximate minimum d estimator. Roughly, to calculate $\gamma_a$, we first obtain the "projection" of $\log p$ in $\Gamma_M(B_1)$ through the minimization of $d(\gamma; n)$ over $\Gamma_M(B_1)$. Let $\bar\gamma$ denote the result of this minimization. The approximate minimum d estimator $\gamma_a$ is the "projection" of $\bar\gamma$ in $\Gamma_S$ obtained through minimization of

$$d^*(\gamma; \bar\gamma) = \sum_{i=1}^{k} (\exp \gamma_i - \exp \bar\gamma_i)^2 / (\exp \bar\gamma_i)$$

over $\Gamma_S$ (see Figure A).
The description in the preceding paragraph is oversimplified in that the existence and uniqueness of $\bar\gamma$ and $\gamma_a$ are tacitly assumed. In the next section, we show that such points do exist, though they need not be unique. More formally, let

$$\bar\Gamma = \{\bar\gamma : \bar\gamma \text{ minimizes } d \text{ over } \Gamma_M(B_1)\}, \tag{2.16}$$

and

$$c(\gamma) = A' \exp \gamma, \tag{2.17}$$

where $A$ is given by (1.2). An approximate minimum d estimator of $\gamma$ is given by

$$\gamma_a = \bar\gamma - A [\log c(\bar\gamma)], \tag{2.18}$$

where $[\log c(\bar\gamma)]' = [\log c_1(\bar\gamma), \ldots, \log c_L(\bar\gamma)]$, and $\bar\gamma \in \bar\Gamma$. We denote the set of points given by (2.18) as $\Gamma_a$ and let $\pi_a$ and $\mu_a$ denote the corresponding estimators of $\pi$ and $\mu$ respectively.
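The second step of the method, (2.17) and (2.18), amounts to subtracting a group-wise log normalizing constant. A minimal sketch follows; the function name `normalize_to_simplex` and the input vector are hypothetical, and the input is simply assumed to be a minimizer of $d$ over $\Gamma_M(B_1)$.

```python
import math

def normalize_to_simplex(gamma_bar, groups):
    """Apply (2.18): gamma_a = gamma_bar - A log c(gamma_bar).

    c_l = sum of exp(gamma_bar_i) over I_l, as in (2.17).
    Returns (gamma_a, c).
    """
    gamma_a = list(gamma_bar)
    c = []
    for idx in groups:
        c_l = sum(math.exp(gamma_bar[i]) for i in idx)
        c.append(c_l)
        for i in idx:
            gamma_a[i] = gamma_bar[i] - math.log(c_l)
    return gamma_a, c

groups = [[0, 1], [2, 3, 4]]
gamma_bar = [-0.5, -1.0, -1.2, -0.9, -1.5]      # hypothetical first-step result
gamma_a, c = normalize_to_simplex(gamma_bar, groups)

# After normalization, exp(gamma_a) sums to one within each index set
sums = [sum(math.exp(gamma_a[i]) for i in idx) for idx in groups]
```

Because the shift is constant within each index set, it lies in the column space of $A$ and therefore does not disturb the linear constraint $B_1 \gamma = 0_m$.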


It is clear that "projection" of $\log p$ into $\Gamma_M(B_1)$ leads to the set $\bar\Gamma$. In the following lemma, we prove that the "projection" of $\bar\gamma \in \bar\Gamma$ in $\Gamma_S$ obtained through minimization of $d^*(\gamma; \bar\gamma)$ over $\Gamma_S$ is the point $\gamma_a$ given by (2.18). We then prove that any element of $\Gamma_a$ lies in $\Gamma(B_1)$, as depicted in Figure A.


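Lemma 2.1: The minimum of

$$d^*(\gamma; \bar\gamma) = \sum_{i=1}^{k} (\exp \gamma_i - \exp \bar\gamma_i)^2 / \exp \bar\gamma_i$$

over $\Gamma_S$ is attained at $\gamma_a$, where $\gamma_a$ is given by (2.18), for any vector $\bar\gamma \in \bar\Gamma$.

Proof: Let $\gamma_0$ be any element of $\Gamma_S$. There exists a k-vector $e$ such that

$$\exp \gamma_{0i} = e_i + \exp \gamma_{ai}, \quad i = 1, \ldots, k.$$

Since $\gamma_0$ and $\gamma_a$ are elements of $\Gamma_S$, it may be shown that

$$\sum_{i \in I_\ell} e_i = 0, \quad \ell = 1, \ldots, L. \tag{2.19}$$

Then

$$d^*(\gamma_0; \bar\gamma) = \sum_{i=1}^{k} (e_i + \exp \gamma_{ai} - \exp \bar\gamma_i)^2 / \exp \bar\gamma_i$$

$$= d^*(\gamma_a; \bar\gamma) + \sum_{i=1}^{k} e_i^2 / \exp \bar\gamma_i - 2 \sum_{\ell=1}^{L} \sum_{i \in I_\ell} e_i [1 - \exp \gamma_{ai} / \exp \bar\gamma_i]$$

$$= d^*(\gamma_a; \bar\gamma) + \sum_{i=1}^{k} e_i^2 / \exp \bar\gamma_i - 2 \sum_{\ell=1}^{L} \bigl(1 - [c_\ell(\bar\gamma)]^{-1}\bigr) \sum_{i \in I_\ell} e_i$$

by (2.17) and (2.18). Finally, by (2.19), we have

$$d^*(\gamma_0; \bar\gamma) = d^*(\gamma_a; \bar\gamma) + \sum_{i=1}^{k} e_i^2 / \exp \bar\gamma_i \ge d^*(\gamma_a; \bar\gamma),$$

and the proof is complete.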
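The decomposition at the end of the proof can be checked numerically. In the sketch below, the vectors `gamma_bar` and `gamma_0` are arbitrary hypothetical choices for a single index set ($L = 1$); the check is that the excess of $d^*$ at `gamma_0` over its value at `gamma_a` equals $\sum_i e_i^2 / \exp \bar\gamma_i$.

```python
import math

def d_star(gamma, gamma_bar):
    """d*(gamma; gamma_bar) = sum_i (exp gamma_i - exp gamma_bar_i)^2 / exp gamma_bar_i."""
    return sum((math.exp(g) - math.exp(gb)) ** 2 / math.exp(gb)
               for g, gb in zip(gamma, gamma_bar))

gamma_bar = [-0.5, -1.0, -1.2]                      # hypothetical; exp sums to c > 1
c = sum(math.exp(g) for g in gamma_bar)
gamma_a = [g - math.log(c) for g in gamma_bar]      # eq. (2.18) with L = 1

# Another point of Gamma_S: logs of an arbitrary probability vector
gamma_0 = [math.log(q) for q in (0.5, 0.3, 0.2)]

# e_i = exp(gamma_0i) - exp(gamma_ai); these sum to zero as in (2.19)
e = [math.exp(g0) - math.exp(ga) for g0, ga in zip(gamma_0, gamma_a)]

gap = d_star(gamma_0, gamma_bar) - d_star(gamma_a, gamma_bar)
predicted = sum(ei ** 2 / math.exp(gb) for ei, gb in zip(e, gamma_bar))
```

The two quantities `gap` and `predicted` agree up to rounding, and both are nonnegative, which is exactly the inequality that makes $\gamma_a$ the minimizer.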

Theorem 2.2: Any $\gamma_a \in \Gamma_a$ lies in $\Gamma(B_1)$.

Proof: By (2.17) and (2.18), $\gamma_a$ lies in $\Gamma_S$. To show $\gamma_a$ lies in $\Gamma_M(B_1)$, we note that

$$B_1 \gamma_a = B_1 \bar\gamma - B_1 A [\log c(\bar\gamma)] = 0_m,$$

by (1.1) and (2.16).

3.  EXISTENCE AND UNIQUENESS OF THE MINIMUM d AND
APPROXIMATE MINIMUM d ESTIMATORS

3.1  Introduction

When each cell of the contingency table contains at least one observation, likelihood, minimum d, and approximate minimum d estimators may be shown to exist and to be unique. Problems may arise, however, when some cells are empty. In this section, we show that an approximate minimum d estimator always exists, and give a usable, necessary and sufficient condition for the existence of the minimum d estimator. The minimum d estimator is shown to be unique whenever it exists. The approximate minimum d estimator need not be unique; a necessary and sufficient condition for its uniqueness is given.

Throughout this section, we shall assume that $B_1$ has fewer than $k - L$ rows, for when $B_1$ has $k - L$ rows, $\Gamma(B_1)$ contains only a single point, and likelihood, minimum d and approximate minimum d estimators must exist and be unique.

We make use of the following notation and definition in this section. In analogy to $\Gamma(B_1)$, for an $L$-vector $\alpha$, $\alpha_\ell > 0$, $\ell = 1, \ldots, L$, let

$$\Gamma(B_1, \alpha) = \Bigl\{\gamma : B_1 \gamma = 0_m,\ \sum_{i \in I_\ell} \exp \gamma_i = \alpha_\ell,\ \ell = 1, \ldots, L\Bigr\}. \tag{3.1}$$

In this notation $\Gamma(B_1) = \Gamma(B_1, 1_L)$.

Definition 3.1: A sequence of k-vectors $\{x_r\}_{r=1}^{\infty}$ is said to possess the star property if

i) $\{x_{rj}\}_{r=1}^{\infty}$ either converges or diverges properly, $j = 1, \ldots, k$, and

ii) there exists $j$ such that $\limsup_r x_{rj} = -\infty$.

3.2  Uniqueness of the Minimum d Estimator

Theorem 3.1: If a minimum of $d$ over $\Gamma(B_1)$ exists, it is unique.

To prove Theorem 3.1, we require a sequence of three lemmas:

Lemma 3.1: Let $\alpha_\ell > 0$, $\ell = 1, \ldots, L$, $n_i > 0$, $i = 1, \ldots, k$, and $q$ be a positive integer. The function,

$$\phi_q(\gamma; n) = \sum_{i=1}^{k} n_i (\log p_i - \gamma_i)^q, \tag{3.2}$$

attains a minimum over $\Gamma(B_1, \alpha)$.

Lemma 3.2: Let $0 < \alpha_\ell \le 1$. For any $\ell$, $\ell = 1, \ldots, L$,

$$\phi_1^{(\ell)}(\gamma; n) = \sum_{i \in I_\ell} n_i (\log p_i - \gamma_i) \ge -N_\ell \log \alpha_\ell \tag{3.3}$$

for $\gamma \in \Gamma_S(\alpha_\ell) = \{\gamma : \sum_{i \in I_\ell} \exp \gamma_i = \alpha_\ell\}$.

Lemma 3.3: (i) Let $\gamma \in \Gamma_M(B_1) \cap \Gamma_S(\alpha_\ell)$, with $0 < \alpha_\ell \le 1$. Then

$$d^{(\ell)}(\gamma; n) = \sum_{i \in I_\ell} n_i (\log p_i - \gamma_i)^2 \ge d^{(\ell)}(\gamma - (\log \alpha_\ell) 1_k; n),$$

with strict inequality if $\alpha_\ell < 1$.

(ii) If $\gamma \in \Gamma(B_1, \alpha)$, $0 < \alpha_\ell \le 1$, $\ell = 1, \ldots, L$, with $\alpha_\ell < 1$ for some $\ell$, then

$$d(\gamma; n) > d(\gamma - A \log \alpha; n). \tag{3.4}$$

The proofs of each lemma, and then of Theorem 3.1, are straightforward, and so are omitted.
3.3  Existence and Uniqueness of the Approximate Minimum d Estimator

The main results of this section are that an approximate minimum d estimator always exists, and that it is unique if and only if $(B_2 N B_2')$ is nonsingular. These results follow from Lemma 3.4, which describes the nature of $\bar\Gamma$ in (2.16), and is stated without proof.

Lemma 3.4: The set $\bar\Gamma$ is a nonempty, affine set of dimension $k - m - \mathrm{rk}(B_2 N B_2')$ and is given by

$$\bar\Gamma = \bigl\{\gamma : \gamma = B_2' \bigl[(B_2 N B_2')^{-} B_2 N \log p + \bigl(I - (B_2 N B_2')^{-} (B_2 N B_2')\bigr) z\bigr],\ z \in E^{k-m}\bigr\}, \tag{3.4}$$

where the notation is such that $A^{-}$ is a generalized inverse of $A$. If $(B_2 N B_2')$ is of full rank, then $\bar\Gamma$ is a singleton set. Furthermore, if $\bar\gamma \in \bar\Gamma$, $c_\ell(\bar\gamma) \ge 1$, where $c_\ell(\bar\gamma)$ is the $\ell$th element of $c(\bar\gamma)$ given by (2.17).

Theorem 3.2: The set $\Gamma_a$ is nonempty. Furthermore $\Gamma_a$ is a singleton set if and only if $\mathrm{rk}(B_2 N B_2') = k - m$.

Proof: The theorem follows immediately from Lemma 3.4 when it is noted that each element $\bar\gamma$ of $\bar\Gamma$ leads to a distinct $\gamma_a \in \Gamma_a$, through use of (2.18).
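When $(B_2 N B_2')$ is nonsingular, the whole approximate method reduces to one weighted least squares solve followed by the normalization (2.18). The sketch below carries this out for a 2 x 2 table under a no-interaction constraint. The table, the matrices `B1` and `B2`, and the small Gaussian elimination helper `solve` are all assumptions made for the example; this is not presented as the authors' computing procedure.

```python
import math

def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

counts = [20, 30, 10, 40]                 # hypothetical 2 x 2 table, L = 1, no empty cells
N_total = sum(counts)
log_p = [math.log(n / N_total) for n in counts]

B1 = [0.5, -0.5, -0.5, 0.5]               # assumed constraint row (m = 1)
B2 = [[0.5, 0.5, 0.5, 0.5],               # assumed orthonormal completion (k - m = 3 rows)
      [0.5, 0.5, -0.5, -0.5],
      [0.5, -0.5, 0.5, -0.5]]

# Step 1: minimize d over Gamma_M(B1), i.e. solve (B2 N B2') beta = B2 N log p
M = [[sum(B2[r][i] * counts[i] * B2[s][i] for i in range(4)) for s in range(3)]
     for r in range(3)]
rhs = [sum(B2[r][i] * counts[i] * log_p[i] for i in range(4)) for r in range(3)]
beta = solve(M, rhs)
gamma_bar = [sum(B2[r][i] * beta[r] for r in range(3)) for i in range(4)]

# Step 2: normalize with (2.17) and (2.18)
c = sum(math.exp(g) for g in gamma_bar)
gamma_a = [g - math.log(c) for g in gamma_bar]
pi_a = [math.exp(g) for g in gamma_a]
```

In this example the normalizing constant `c` is at least one, in line with Lemma 3.4, and `gamma_a` satisfies both the linear constraint and the sum constraint that define $\Gamma(B_1)$.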
3.4  Existence of the Minimum d Estimator

The existence of the minimum d estimator depends upon the space $\bar\Gamma$, which was defined in Section 2 and whose properties were noted in the previous section. Let

$$\Gamma^* = \Bigl\{\gamma : \gamma \in \bar\Gamma,\ \sum_{i \in I_\ell} \exp \gamma_i = c_\ell,\ \ell = 1, \ldots, L\Bigr\}, \tag{3.5}$$

where

$$c_\ell = \inf_{\gamma \in \bar\Gamma} \{c_\ell(\gamma)\} = \inf_{\gamma \in \bar\Gamma} \Bigl\{\sum_{i \in I_\ell} \exp \gamma_i\Bigr\}, \quad \ell = 1, \ldots, L. \tag{3.6}$$

In this subsection we show that $\Gamma^*$ is either an empty or singleton set, and that a minimum d estimator exists if and only if $\Gamma^*$ is a singleton set.

Lemma 3.5: If $\Gamma^*$ is not empty, then it is a single point.

Proof: The proof is by contradiction. Suppose that $\gamma_1$ and $\gamma_2$ are both contained in $\Gamma^*$ and that $\gamma_1 \ne \gamma_2$. Since $\bar\Gamma$ is an affine set (Lemma 3.4), $\gamma_3 = \tfrac12 (\gamma_1 + \gamma_2) \in \bar\Gamma$. Furthermore, since the exponential is a convex function, there exists $\ell$ for which

$$c_\ell = \tfrac12 \sum_{j=1}^{2} \sum_{i \in I_\ell} \exp \gamma_{ji} > \sum_{i \in I_\ell} \exp \gamma_{3i}. \tag{3.7}$$

But (3.7) contradicts the fact that

$$c_\ell = \inf_{\gamma \in \bar\Gamma} \sum_{i \in I_\ell} \exp \gamma_i = \sum_{i \in I_\ell} \exp \gamma_{1i} = \sum_{i \in I_\ell} \exp \gamma_{2i},$$

and the lemma is proved.

Let

$$\Gamma_C(B_1, \alpha) = \Bigl\{\gamma : B_1 \gamma = 0_m,\ \sum_{i \in I_\ell} \exp \gamma_i \le \alpha_\ell,\ \ell = 1, \ldots, L\Bigr\}. \tag{3.8}$$

The following is a technical result needed to prove the main result of this article.

Lemma 3.6: Suppose that $d$ has a unique minimum over $\Gamma_C(B_1, \alpha)$. Then, if $\{\gamma_r\}_{r=1}^{\infty}$ is any sequence of points in $\Gamma_C(B_1, \alpha)$ with the star property,

$$\liminf_r d(\gamma_r; n) = \infty.$$

We now show that the nonemptiness of $\Gamma^*$ is necessary and sufficient for the existence of a minimum d estimator.

Theorem 3.3: A minimum d estimator exists if and only if $\Gamma^*$ is nonempty.

Proof: Suppose that $\Gamma^*$ contains the point $\gamma^*$. Then $\gamma^*$ minimizes $d$ over $\Gamma_C(B_1, c)$ and is unique by Lemma 3.5. If $\{\gamma_r\}_{r=1}^{\infty}$ is a sequence of points in $\Gamma(B_1)$ with the star property, then, by Lemma 3.6,

$$\liminf_r d(\gamma_r; n) = \infty. \tag{3.9}$$

Note that $a = -\log(A A' 1_k) \in \Gamma(B_1)$, and that $d(a; n)$ is finite. This, the fact that $d$ is continuous, and (3.9) imply that a minimum of $d$ over $\Gamma(B_1)$ exists. This proves sufficiency.

Now suppose that $\Gamma^*$ is empty, and let $\{\gamma_r\}_{r=1}^{\infty}$ be a sequence of points in $\bar\Gamma$ such that

$$\lim_r \sum_{i \in I_\ell} \exp \gamma_{ri} = c_\ell, \quad \ell = 1, \ldots, L. \tag{3.10}$$

From the continuity of the exponential function and the definition of $c_\ell$ in (3.6), such a sequence must exist and, since $\Gamma^*$ is empty, must possess the star property.

Let

$$\gamma_r^* = \gamma_r - A \log c(\gamma_r), \quad r = 1, 2, \ldots, \tag{3.11}$$

and note that $\gamma_r^* \in \Gamma(B_1)$, $r = 1, 2, \ldots$. To prove necessity, we need to show that

$$\liminf_r d(\gamma_r^*; n) < \infty. \tag{3.12}$$


To prove (3.12), we note that

$$d(\gamma_r^*; n) = d(\gamma_r; n) + \sum_{\ell=1}^{L} (\log c_{r\ell})^2 \sum_{i \in I_\ell} n_i + 2 \sum_{\ell=1}^{L} \log c_{r\ell} \sum_{i \in I_\ell} n_i (\log p_i - \gamma_{ri}), \tag{3.13}$$

where $c_{r\ell} = \sum_{i \in I_\ell} \exp \gamma_{ri}$, $\ell = 1, \ldots, L$. Since

$$\sum_{i \in I_\ell} n_i (\log p_i - \gamma_{ri}) = \sum_{\{i \in I_\ell : |\log p_i - \gamma_{ri}| \ge 1\}} n_i (\log p_i - \gamma_{ri}) + \sum_{\{i \in I_\ell : |\log p_i - \gamma_{ri}| < 1\}} n_i (\log p_i - \gamma_{ri})$$

$$\le \sum_{\{i \in I_\ell : |\log p_i - \gamma_{ri}| \ge 1\}} n_i (\log p_i - \gamma_{ri}) + N_\ell$$

$$\le \sum_{i \in I_\ell} n_i (\log p_i - \gamma_{ri})^2 + N_\ell, \quad \ell = 1, \ldots, L,$$

(3.13) implies that

$$d(\gamma_r^*; n) \le d(\gamma_r; n) + \sum_{\ell=1}^{L} N_\ell (\log c_{r\ell})^2 + 2 \log C_r \, [d(\gamma_r; n) + N], \tag{3.14}$$

where

$$C_r = \max_{\ell = 1, \ldots, L} c_{r\ell}. \tag{3.15}$$

Note that $d(\gamma_r; n)$, $N$, and $N_\ell$, $\ell = 1, \ldots, L$, do not depend on $r$, and that $c_{r\ell}$ and $C_r$ tend to finite limits by (3.10). Thus

$$\liminf_r d(\gamma_r^*; n) \le \liminf_r \Bigl\{ d(\gamma_r; n) + \sum_{\ell=1}^{L} N_\ell (\log c_{r\ell})^2 + 2 \log C_r \, [d(\gamma_r; n) + N] \Bigr\} < \infty,$$

which establishes (3.12).

We have exhibited a sequence of vectors $\{\gamma_r^*\}_{r=1}^{\infty}$ in $\Gamma(B_1)$, and hence in $\Gamma_C(B_1, 1_L)$, which violates the conclusion of Lemma 3.6, and hence implies that no unique minimum of $d$ over $\Gamma_C(B_1, 1_L)$ exists. Thus, either no minimum exists or the minimum is not unique.

Suppose two distinct vectors $\gamma_1$ and $\gamma_2$ which minimize $d$ over $\Gamma_C(B_1, 1_L)$ exist. By part (ii) of Lemma 3.3, $\gamma_1$ and $\gamma_2$ must lie in $\Gamma(B_1)$. But this contradicts Theorem 3.1, which states that, if a minimum of $d$ over $\Gamma(B_1)$ exists it is unique. Thus no minimum exists. This proves necessity, and the proof of the theorem is complete.
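The identity (3.13) used in the necessity argument is purely algebraic: it holds for any vector and its group-wise normalization, not only for members of the sequence in the proof. A quick numerical check, with an arbitrary hypothetical table and vector, can be sketched as follows.

```python
import math

def d(gamma, counts, groups):
    """d(gamma; n) of (2.1), skipping empty cells as in (2.2)."""
    total = 0.0
    for idx in groups:
        n_l = sum(counts[i] for i in idx)
        for i in idx:
            if counts[i] > 0:
                total += counts[i] * (math.log(counts[i] / n_l) - gamma[i]) ** 2
    return total

counts = [10, 30, 20, 40, 40]                  # hypothetical table, L = 2
groups = [[0, 1], [2, 3, 4]]
gamma = [-0.4, -0.9, -1.0, -0.8, -1.1]         # hypothetical gamma_r

# c_{r,l} and the normalized vector gamma_r^* of (3.11)
c = [sum(math.exp(gamma[i]) for i in idx) for idx in groups]
gamma_star = list(gamma)
for l, idx in enumerate(groups):
    for i in idx:
        gamma_star[i] = gamma[i] - math.log(c[l])

# Left and right sides of (3.13)
lhs = d(gamma_star, counts, groups)
rhs = d(gamma, counts, groups)
for l, idx in enumerate(groups):
    n_l = sum(counts[i] for i in idx)
    linear = sum(counts[i] * (math.log(counts[i] / n_l) - gamma[i]) for i in idx)
    rhs += n_l * math.log(c[l]) ** 2 + 2 * math.log(c[l]) * linear
```

The two sides agree up to rounding, which confirms that the bound (3.14) only requires control of the linear term and of $\log c_{r\ell}$.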

The  following  corollary  is  useful  and  follows  from 
Theorems  3.2  and  3.4. 

Corollary 3.4: If $(B_2 N B_2')$ is of rank $k - m$, then a minimum d estimator exists.

4.  ASYMPTOTIC PROPERTIES

As the sample sizes become large, since $\pi > 0$, the probability that each cell in the contingency table contains at least one observation tends to one. Therefore, the probability that the defined estimators exist and are unique tends to one by Corollary 3.4 and Theorem 3.2. Thus we assume that the various estimators exist and are unique and consider their asymptotic properties. Further, we assume that there exists an $L$-vector $e$, $e_\ell > 0$, $\ell = 1, \ldots, L$, such that

$$\lim_{N_\ell, N \to \infty} N_\ell / N = e_\ell,$$

where $N_\ell$ denotes the sample size for population $\ell$, $\ell = 1, \ldots, L$, and $N = \sum_{\ell=1}^{L} N_\ell$ denotes total sample size.

Two main results are given in this section.

The first result is Theorem 4.1. Two sequences of random k-vectors, $\{X_N\}_{N=1}^{\infty}$ and $\{Y_N\}_{N=1}^{\infty}$, are said to be $a_N$-limit equivalent in probability ($a_N$-l.e.p.) if

$$a_N (X_N - Y_N) \xrightarrow{p} 0_k. \tag{4.1}$$

Theorem 4.1: Any pair of the maximum likelihood estimator $\hat\gamma$, the minimum d estimator $\tilde\gamma$, the approximation $\gamma_a$, and $\bar\gamma$ are $\sqrt{N}$-l.e.p.

Proof: We prove that $\hat\gamma$ and $\bar\gamma$ are $\sqrt{N}$-l.e.p. to give an indication of the method of proof of the various pairwise results.

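Likelihood estimation is reviewed in Section 1. As a slight generalization of (1.11), we have

$$B_1 \hat\gamma = 0_m, \tag{4.2}$$

and

$$B_2 E (p - \exp \hat\gamma) = 0_{k-m}, \tag{4.3}$$

where $E$ denotes the diagonal matrix whose $i$th element equals $e_\ell$, $i \in I_\ell$, $\ell = 1, \ldots, L$. From the definition of $\bar\gamma$ in Section 2, the minimization procedure yields

$$B_1 \bar\gamma = 0_m \tag{4.4}$$

and

$$B_2 N (\log p - \bar\gamma) = 0_{k-m}. \tag{4.5}$$

Replacement, in (4.3), of $\exp \hat\gamma_i$, $i = 1, \ldots, k$, by its expansion about $\log p_i$ yields

$$B_2 E \bigl\{ p_i - p_i - p_i (\hat\gamma_i - \log p_i) - \tfrac12 (\exp \nu_i)(\hat\gamma_i - \log p_i)^2 \bigr\} = -N^{-1} \left\{ B_2 N (\hat\gamma - \log p) + \tfrac{N}{2} B_2 E V \begin{bmatrix} (\hat\gamma_1 - \log p_1)^2 \\ \vdots \\ (\hat\gamma_k - \log p_k)^2 \end{bmatrix} \right\} = 0_{k-m}, \tag{4.6}$$

where $V$ is the diagonal matrix with elements $\exp \nu_1, \ldots, \exp \nu_k$, and $\nu' = (\nu_1, \ldots, \nu_k)$ is such that $\nu_i$ lies between $\log p_i$ and $\hat\gamma_i$, $i = 1, \ldots, k$.

Subtraction of $N$ times the expression in (4.6) from (4.5), and subtraction of (4.4) from (4.2) yield

$$B_1 (\hat\gamma - \bar\gamma) = 0_m \tag{4.7}$$

and

$$B_2 N (\hat\gamma - \bar\gamma) + \tfrac{N}{2} B_2 E V \begin{bmatrix} (\hat\gamma_1 - \log p_1)^2 \\ \vdots \\ (\hat\gamma_k - \log p_k)^2 \end{bmatrix} = 0_{k-m}. \tag{4.8}$$

Slutsky's Theorem (Serfling, 1980, p. 19) implies that

$$\{N^{-1/2} B_2 N (\hat\gamma - \bar\gamma) - \sqrt{N} B_2 \Pi E (\hat\gamma - \bar\gamma)\} \xrightarrow{p} 0_{k-m}, \tag{4.9}$$

where $\Pi$ denotes the diagonal matrix with elements $\pi_1, \ldots, \pi_k$, since $n_i = p_i N_\ell$, $i \in I_\ell$, and $p_i \xrightarrow{p} \pi_i$.

To complete proof of this part of the theorem, it remains to show that

$$\sqrt{N} B_2 E V \begin{bmatrix} (\hat\gamma_1 - \log p_1)^2 \\ \vdots \\ (\hat\gamma_k - \log p_k)^2 \end{bmatrix} \xrightarrow{p} 0_{k-m}. \tag{4.10}$$

Since $\hat\gamma$ and $\log p$ are consistent estimators of $\gamma$, $\nu \xrightarrow{p} \gamma$, and, since $\sqrt{N} (\hat\gamma_i - \gamma_i)$ and $\sqrt{N} (\log p_i - \gamma_i)$ have limiting distributions, Slutsky's Theorem determines that

$$\sqrt{N} \begin{bmatrix} (\hat\gamma_1 - \log p_1)^2 \\ \vdots \\ (\hat\gamma_k - \log p_k)^2 \end{bmatrix} \xrightarrow{p} 0_k. \tag{4.11}$$

Finally, (4.7)-(4.10) imply that

$$\sqrt{N} \begin{bmatrix} B_1 \\ B_2 \Pi E \end{bmatrix} (\hat\gamma - \bar\gamma) \xrightarrow{p} 0_k, \tag{4.12}$$

and, since

$$\begin{bmatrix} B_1 \\ B_2 \Pi E \end{bmatrix}$$

is of full rank,

$$\sqrt{N} (\hat\gamma - \bar\gamma) \xrightarrow{p} 0_k,$$

and the proof is complete.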

The second theorem of this section follows.

Theorem 4.2:

$$d(\gamma^*; n) - X^2(\gamma^*; n) \xrightarrow{p} 0, \tag{4.13}$$

where the function

$$X^2(\gamma; n) = \sum_{\ell=1}^{L} \sum_{i \in I_\ell} N_\ell (p_i - \exp \gamma_i)^2 / \exp \gamma_i$$

is the well-known Pearson function, and $\gamma^*$ denotes any of the likelihood, minimum d, or approximate minimum d estimators.

This result is proven by expansion of $\exp \gamma_i^*$ in the numerator of $X^2(\gamma^*; n)$ about $\log p_i$.
Theorem 4.2 allows us to test a null hypothesis $H_0$, specified through a matrix $B_1^*$, where $B_1^*$ is an $m^* \times k$ matrix with orthonormal rows such that $B_1^* A = 0_{m^* \times L}$, against an alternative hypothesis $H_a$. Define

$$T = d(\gamma^*_{H_0}; n) - d(\gamma^*_{H_a}; n),$$

where $\gamma^*_{H_0}$ and $\gamma^*_{H_a}$ are estimates of $\gamma$ under $H_0$ and $H_a$ respectively. The limiting distribution of test statistics based on $X^2$, and, in view of Theorem 4.2, of $T$ are given by Mitra (1958) and Diamond (1963) for various $m$ and "local alternatives". In particular, under $H_0$, $T$ has a limiting chi-squared distribution with $m$ degrees of freedom.


5.  CONCLUDING REMARKS

The emphasis in this paper has been on the basic properties of the minimum d and approximate minimum d estimators. Their existence properties have been considered and they have been shown to have large sample properties equivalent to those of likelihood estimators. The minimum d and approximate minimum d methods lead to asymptotic chi-squared tests analogous to those of the Pearson goodness of fit statistic.

A second manuscript will deal with the application of log-linear models and the new estimators to the classification problem of Martin and Bradley (1972). The existence results and test procedures will be illustrated there, and extensions to the problem considered by Martin and Bradley (1972) given.